NEON: more fp16 using intrinsics supported by architecture v7 (skip version) #1081
SIMDE_CONSTIFY_8_(vget_lane_f16, r, (HEDLEY_UNREACHABLE(), SIMDE_FLOAT16_VALUE(0.0)), lane, v);
SIMDE_CONSTIFY_8_(vgetq_lane_f16, r, (HEDLEY_UNREACHABLE(), SIMDE_FLOAT16_VALUE(0.0)), lane, v);
Odd, the tests already covered lane values above 3; why didn't that cause a problem already?
@yyctw Can you take a look at simde/test/arm/neon/get_lane.c, lines 505 to 518 in d08d67c:
     INT8_C( 4),
    SIMDE_FLOAT16_VALUE(281.00)},
  { { SIMDE_FLOAT16_VALUE( 392.00), SIMDE_FLOAT16_VALUE( -758.50), SIMDE_FLOAT16_VALUE( -870.50), SIMDE_FLOAT16_VALUE( -511.25),
      SIMDE_FLOAT16_VALUE( 731.50), SIMDE_FLOAT16_VALUE( 345.75), SIMDE_FLOAT16_VALUE( -405.25), SIMDE_FLOAT16_VALUE( -353.75) },
     INT8_C( 5),
    SIMDE_FLOAT16_VALUE(345.75)},
  { { SIMDE_FLOAT16_VALUE( 345.75), SIMDE_FLOAT16_VALUE( 372.75), SIMDE_FLOAT16_VALUE( 802.50), SIMDE_FLOAT16_VALUE( -373.00),
      SIMDE_FLOAT16_VALUE( 133.12), SIMDE_FLOAT16_VALUE( 928.00), SIMDE_FLOAT16_VALUE( -18.17), SIMDE_FLOAT16_VALUE( -974.50) },
     INT8_C( 6),
    SIMDE_FLOAT16_VALUE(-18.17)},
  { { SIMDE_FLOAT16_VALUE( -634.00), SIMDE_FLOAT16_VALUE( -283.75), SIMDE_FLOAT16_VALUE( -99.50), SIMDE_FLOAT16_VALUE( 134.00),
      SIMDE_FLOAT16_VALUE( -781.50), SIMDE_FLOAT16_VALUE( 1188.00), SIMDE_FLOAT16_VALUE( -106.88), SIMDE_FLOAT16_VALUE( -497.25) },
     INT8_C( 7),
    SIMDE_FLOAT16_VALUE(-497.25)},
Based on my previous conclusion [ref], even if the HEDLEY_UNREACHABLE() is executed during the testing process (when lane > 3), it will not cause the test to fail.
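For readers following the lane discussion above, here is a minimal sketch of the dispatch pattern that a CONSTIFY-style macro provides: a run-time lane index is mapped by a switch onto calls that each pass a literal constant. Every name here (DEMO_CONSTIFY_8, demo_vec8, demo_get_lane_const, demo_get_lane) is hypothetical and is not SIMDe's actual definition. The default case uses a plain sentinel value instead of HEDLEY_UNREACHABLE(), so the sketch says nothing about what the real macro does when its default is reached.

```c
#include <assert.h>

/* Hypothetical 8-lane vector; plain floats stand in for float16 values. */
typedef struct { float values[8]; } demo_vec8;

/* Hypothetical stand-in for an intrinsic whose lane argument must be a
 * compile-time constant; here it is an ordinary function. */
static float demo_get_lane_const(demo_vec8 v, int lane) {
  assert(lane >= 0 && lane < 8);
  return v.values[lane];
}

/* Hypothetical 8-way dispatch: each case passes a literal lane number. */
#define DEMO_CONSTIFY_8(func, result, default_case, imm, vec) \
  do { \
    switch (imm) { \
      case 0: (result) = func((vec), 0); break; \
      case 1: (result) = func((vec), 1); break; \
      case 2: (result) = func((vec), 2); break; \
      case 3: (result) = func((vec), 3); break; \
      case 4: (result) = func((vec), 4); break; \
      case 5: (result) = func((vec), 5); break; \
      case 6: (result) = func((vec), 6); break; \
      case 7: (result) = func((vec), 7); break; \
      default: (result) = (default_case); break; \
    } \
  } while (0)

/* Run-time lane in, value out; -1.0f is an arbitrary sentinel for
 * out-of-range lanes (an assumption made only for this sketch). */
static float demo_get_lane(demo_vec8 v, int lane) {
  float r;
  DEMO_CONSTIFY_8(demo_get_lane_const, r, -1.0f, lane, v);
  return r;
}
```

With eight cases, lanes 4 through 7 each have their own branch; a dispatch with only four cases would send those lanes to the default branch instead.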
simde_float16x4x2_t s_ = { { simde_float16x4_from_private(a_[0]),
                             simde_float16x4_from_private(a_[1]) } };
return s_;
#endif
On the x86 side we would have something like a simde_float16x4x2_private typedef union with simde_float16x4 and simde_float16x4_private fields.
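One possible reading of that suggestion, as a rough sketch only: a union that overlays the public vector type and its private, element-accessible form, so a pair can be filled lane by lane and handed back without per-element conversion calls. All types and functions below (demo_f16x4, demo_f16x4_private, demo_f16x4x2, demo_f16x4x2_private, demo_make_pair, demo_get) are hypothetical stand-ins and not SIMDe's real definitions; raw 16-bit patterns stand in for float16 values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical private form: four 16-bit lanes, individually addressable. */
typedef struct { uint16_t values[4]; } demo_f16x4_private;

/* Hypothetical public form: an opaque 64-bit vector. */
typedef struct { uint64_t opaque; } demo_f16x4;

/* Hypothetical public pair type, analogous to an "x2" vector struct. */
typedef struct { demo_f16x4 val[2]; } demo_f16x4x2;

/* The union overlays both views of the same 16 bytes. */
typedef union {
  demo_f16x4         pub[2];
  demo_f16x4_private priv[2];
} demo_f16x4x2_private;

/* Fill the pair through the private view, return it through the public one. */
static demo_f16x4x2 demo_make_pair(const uint16_t lo[4], const uint16_t hi[4]) {
  demo_f16x4x2_private u;
  for (int i = 0; i < 4; i++) {
    u.priv[0].values[i] = lo[i];
    u.priv[1].values[i] = hi[i];
  }
  demo_f16x4x2 r = { { u.pub[0], u.pub[1] } };
  return r;
}

/* Read one lane back by going public -> private through the same union. */
static uint16_t demo_get(demo_f16x4x2 p, int which, int lane) {
  assert(which >= 0 && which < 2 && lane >= 0 && lane < 4);
  demo_f16x4x2_private u;
  u.pub[0] = p.val[0];
  u.pub[1] = p.val[1];
  return u.priv[which].values[lane];
}
```

The sketch relies on C's rule that reading a union member other than the one last written reinterprets the stored bytes; in C++ the same access pattern would need memcpy instead.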
vmulh_lane_f16, vmulh_laneq_f16, vmul_lane_f16, vmul_laneq_f16, vmulq_laneq_f16.
Changed the incorrect "Ties to Away" implementation to "rounding to nearest with ties to Away".
add.h: Remove redundant code.
one ld2_f16, twenty-two ld2_lane series, and twenty-two ld2_dup series.
Hi all, this is Eric from Andes Technology Corporation.
This PR is based on the previous PR with certain functions removed to avoid triggering compiler bugs.